Fusing Text and Image for Event Detection in Twitter

نویسندگان

  • Samar M. Alqhtani
  • Suhuai Luo
  • Brian Regan
چکیده

In this contribution, we develop an accurate and effective event detection method to detect events from a Twitter stream, which uses visual and textual information to improve the performance of the mining process. The method monitors a Twitter stream to pick up tweets having texts and images and stores them into a database. This is followed by applying a mining algorithm to detect an event. The procedure starts with detecting events based on text only by using the feature of the bag-of-words which is calculated using the term frequency-inverse document frequency (TF-IDF) method. Then it detects the event based on image only by using visual features including histogram of oriented gradients (HOG) descriptors, grey-level cooccurrence matrix (GLCM), and color histogram. K nearest neighbours (Knn) classification is used in the detection. The final decision of the event detection is made based on the reliabilities of text only detection and image only detection. The experiment result showed that the proposed method achieved high accuracy of 0.94, comparing with 0.89 with texts only, and 0.86 with images only.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Event Detection in Twitter Using Text and Image Fusion

In this paper, we describe an accurate and effective event detection method to detect events from Twitter stream. It detects events using visual information as well as textual information to improve the performance of the mining. It monitors Twitter stream to pick up tweets having texts and photos and stores them into database. Then it applies mining algorithm to detect the event. Firstly, it d...

متن کامل

A Saliency Detection Model via Fusing Extracted Low-level and High-level Features from an Image

Saliency regions attract more human’s attention than other regions in an image. Low- level and high-level features are utilized in saliency region detection. Low-level features contain primitive information such as color or texture while high-level features usually consider visual systems. Recently, some salient region detection methods have been proposed based on only low-level features or hig...

متن کامل

Multimodal Event Detection in Twitter Hashtag Networks

Event detection in a multimodal Twitter dataset is considered. We treat the hashtags in the dataset as instances with two modes: text and geolocation features. The text feature consists of a bag-of-words representation. The geolocation feature consists of geotags (i.e., geographical coordinates) of the tweets. Fusing the multimodal data we aim to detect, in terms of topic and geolocation, the i...

متن کامل

What Is New in Our City? A Framework for Event Extraction Using Social Media Posts

Post streams from public social media platforms such as Instagram and Twitter have become precious but noisy data sources to discover what is happening around us. In this paper, we focus on the problem of detecting and presenting local events in real time using social media content. We propose a novel framework for real-time city event detection and extraction. The proposed framework first appl...

متن کامل

Document Image Dewarping Based on Text Line Detection and Surface Modeling (RESEARCH NOTE)

Document images produced by scanner or digital camera, usually suffer from geometric and photometric distortions. Both of them deteriorate the performance of OCR systems. In this paper, we present a novel method to compensate for undesirable geometric distortions aiming to improve OCR results. Our methodology is based on finding text lines by dynamic local connectivity map and then applying a l...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1503.03920  شماره 

صفحات  -

تاریخ انتشار 2015